KEYWORD EXTRACTION FROM A SINGLE DOCUMENT USING WORD CO-OCCURRENCE STATISTICAL INFORMATION



Similar Resources

Keyword Extraction from a Single Document using Word Co-occurrence Statistical Information

We present a new keyword extraction algorithm that applies to a single document without using a corpus. Frequent terms are extracted first, then a set of co-occurrences between each term and the frequent terms, i.e., occurrences in the same sentences, is generated. The co-occurrence distribution shows the importance of a term in the document as follows. If probability distribution of co-occurrence betwee...
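The first two steps named in this abstract (extracting frequent terms, then counting sentence-level co-occurrences between each term and those frequent terms) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `sentence_cooccurrence`, the parameter `num_frequent`, and the naive regex-based sentence and word splitting are all assumptions made for the example. The scoring step is not shown because the abstract is truncated before it is described.

```python
import re
from collections import Counter, defaultdict


def sentence_cooccurrence(text, num_frequent=3):
    """Count how often each term shares a sentence with each frequent term.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not taken from the paper.
    """
    # Naive sentence and word splitting (assumption): split on ., !, ?
    # and keep lowercase alphabetic tokens only.
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    tokenized = [re.findall(r"[a-z]+", s) for s in sentences]

    # Step 1: pick the most frequent terms in the document.
    term_counts = Counter(w for words in tokenized for w in words)
    frequent = [w for w, _ in term_counts.most_common(num_frequent)]

    # Step 2: for every term, count the sentences it shares with each
    # frequent term (a term is not paired with itself).
    cooc = defaultdict(Counter)
    for words in tokenized:
        present = set(words)
        for term in present:
            for g in frequent:
                if g != term and g in present:
                    cooc[term][g] += 1
    return frequent, cooc
```

Calling `sentence_cooccurrence(document_text, num_frequent=10)` returns the list of frequent terms and a nested counter, so `cooc[w][g]` is the number of sentences in which term `w` appears together with frequent term `g`. A real pipeline would likely add stop-word removal, stemming, and phrase handling before counting.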


Keyword Extraction from a Single Document Using Centrality Measures

Keywords characterize the topics discussed in a document. Extracting a small set of keywords from a single document is an important problem in text mining. We propose a hybrid structural and statistical approach to extract keywords. We represent the given document as an undirected graph, whose vertices are words in the document and the edges are labeled with a dissimilarity measure between two ...


Segmented Spoken Document Retrieval Using Word Co-occurrence Information

This paper shows several approaches for NTCIR-11 SpokenQuery&Doc [1]. This paper proposes several schemes to use word co-occurrence information for spoken document retrieval. Automatic transcriptions of spoken documents usually contain mis-recognized words, making the performance of spoken document retrieval significantly decrease. The cosine similarity to measure a document similarity must be i...


Single Document Keyphrase Extraction Using Label Information

Keyphrases have found wide-ranging application in NLP and IR tasks such as document summarization, indexing, labeling, clustering and classification. In this paper we pose the problem of extracting label-specific keyphrases from a document that has document-level metadata associated with it, namely labels or tags (i.e., a multi-labeled document). Unlike other, supervised or unsupervised, methods f...


Document Clustering Using Co-word Analysis and Formation of Keyword against Document Matrix

The complexity of retrieving relevant documents from a large corpus is a common and challenging problem in the areas of web mining and search engines. In addition, the growth of unlabelled and unsupervised documents also increases this complexity. Document clustering algorithms play a vital role in reducing this problem. In this paper, an algorithm was proposed to cluster ...



Journal

Journal title: International Journal on Artificial Intelligence Tools

Year: 2004

ISSN: 0218-2130,1793-6349

DOI: 10.1142/s0218213004001466